Hierarchical Topic Structuring: From Dense Segmentation to Topically Focused Fragments via Burst Analysis

Authors

  • Anca-Roxana Simon
  • Pascale Sébillot
  • Guillaume Gravier
Abstract

Topic segmentation traditionally relies on lexical cohesion measured through word re-occurrences to output a dense segmentation, either linear or hierarchical. In this paper, a novel organization of the topical structure of textual content is proposed. Rather than searching for topic shifts to yield a dense segmentation, we propose an algorithm to extract topically focused fragments organized in a hierarchical manner. This is achieved by leveraging the temporal distribution of word re-occurrences, searching for bursts, to skirt the limits imposed by a global counting of lexical re-occurrences within segments. Comparison to a reference dense segmentation on varied datasets indicates that we can achieve a better topic focus while retrieving all of the important aspects of a text.
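The abstract does not say how bursts are detected, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes a burst can be approximated as a run of occurrences of the same word separated by small gaps. The function name `find_bursts` and the thresholds `max_gap` and `min_count` are hypothetical choices made for illustration.

```python
from collections import defaultdict


def find_bursts(tokens, max_gap=3, min_count=3):
    """Return {word: [(start, end), ...]} of token-index spans where a word recurs densely.

    Assumption (not from the paper): a burst is a maximal run of at least
    `min_count` occurrences of the same word in which consecutive
    occurrences are at most `max_gap` tokens apart.
    """
    # Record the positions at which each (lowercased) word occurs.
    positions = defaultdict(list)
    for i, tok in enumerate(tokens):
        positions[tok.lower()].append(i)

    bursts = {}
    for word, pos in positions.items():
        spans = []
        run = [pos[0]]
        for p in pos[1:]:
            if p - run[-1] <= max_gap:
                # Still close to the previous occurrence: extend the run.
                run.append(p)
            else:
                # Gap too large: close the current run and start a new one.
                if len(run) >= min_count:
                    spans.append((run[0], run[-1]))
                run = [p]
        if len(run) >= min_count:
            spans.append((run[0], run[-1]))
        if spans:
            bursts[word] = spans
    return bursts


tokens = "cat sat cat ran cat far away dog barked loud then quiet cat".split()
print(find_bursts(tokens))  # {'cat': [(0, 4)]}
```

In this toy input the last, isolated occurrence of "cat" is left out of the burst. A global count over the whole text would treat all four occurrences alike, whereas looking at where they fall isolates the focused span. Rerunning with a larger `max_gap` gives wider spans that contain the tighter ones, which is one simple way such fragments could be nested.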

Similar resources

Traffic Scene Analysis using Hierarchical Sparse Topical Coding

Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...

Prosody and topic structuring in spoken dialogue

Prosody is critical in conveying topic coherence and the salience of information in speech. In this study we propose that the overall coherence is brought about through pitch level structuring of phrases at both the local level of hierarchical phrase unit positioning and the global level of pitch baseline rise and fall as climax and resolution. Our results show that prosody has critical importa...

IRISA at MediaEval 2015: Search and Anchoring in Video Archives Task

This paper presents our approach and results in the Search and Anchoring in Video Archives task at MediaEval 2015. The Search part aims at returning a ranked list of video segments that are relevant to a textual user query. The Anchoring part focuses on the automatic selection of video segments, from a list of videos, that can be used as anchors to encourage further exploration within the archi...

A Statistical Model for Topic Segmentation and Clustering

This paper presents a statistical model for discovering topical clusters of words in unstructured text. The model uses a hierarchical Bayesian structure and it is also able to identify segments of text which are topically coherent. The model is able to assign each segment to a particular topic and thus categorizes the corresponding document to potentially multiple topics. We present some initia...

A hierarchical Convolutional Neural Network for Segmentation of Stroke Lesion in 3D Brain MRI

Introduction: Brain tumors such as glioma are among the most aggressive lesions, which result in a very short life expectancy in patients. Image segmentation is highly essential in medical image analysis with applications, particularly in clinical practices to treat brain tumors. Accurate segmentation of magnetic resonance data is crucial for diagnostic purposes, planning surgical treatments, a...

Publication date: 2015